
 ica model





A Pseudo-Euclidean Iteration for Optimal Recovery in Noisy ICA

Neural Information Processing Systems

Independent Component Analysis (ICA) is a popular model for blind signal separation. The ICA model assumes that a number of independent source signals are linearly mixed to form the observed signals. We propose a new algorithm, PEGI (for pseudo-Euclidean Gradient Iteration), for provable model recovery for ICA with Gaussian noise. The main technical innovation of the algorithm is to use a fixed point iteration in a pseudo-Euclidean (indefinite "inner product") space.
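The noisy ICA model described in this abstract can be illustrated with a small simulation. The following is a minimal sketch of the data model only, not of PEGI itself. The helper name `sample_noisy_ica`, the uniform source distribution, the mixing matrix, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def sample_noisy_ica(n_samples, mixing, noise_std, seed=0):
    """Draw observations X = A S + eta (hypothetical helper, not from the paper)."""
    rng = np.random.default_rng(seed)
    n_obs, n_src = mixing.shape
    # Independent, non-Gaussian sources with unit variance:
    # uniform on [-sqrt(3), sqrt(3)] has variance exactly 1.
    half_width = np.sqrt(3.0)
    sources = rng.uniform(-half_width, half_width, size=(n_src, n_samples))
    # Additive Gaussian noise, independent of the sources.
    noise = noise_std * rng.standard_normal((n_obs, n_samples))
    observed = mixing @ sources + noise
    return observed, sources


# Assumed example values.
A = np.array([[1.0, 0.5],
              [0.2, 1.0]])
noise_std = 0.3
X, S = sample_noisy_ica(200_000, A, noise_std)

# With unit-variance sources, the observation covariance is A A^T + sigma^2 I.
empirical_cov = np.cov(X)
model_cov = A @ A.T + noise_std**2 * np.eye(2)
```

The covariance alone mixes the contribution of the sources with that of the noise, which is one way to see why noisy ICA needs more than ordinary whitening.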


On the Identifiability of Nonlinear ICA: Sparsity and Beyond

arXiv.org Artificial Intelligence

Nonlinear independent component analysis (ICA) aims to recover the underlying independent latent sources from their observable nonlinear mixtures. How to make the nonlinear ICA model identifiable up to certain trivial indeterminacies is a long-standing problem in unsupervised learning. Recent breakthroughs reformulate the standard independence assumption of sources as conditional independence given some auxiliary variables (e.g., class labels and/or domain/time indexes) as weak supervision or inductive bias. However, nonlinear ICA with unconditional priors cannot benefit from such developments. We explore an alternative path and consider only assumptions on the mixing process, such as Structural Sparsity. We show that under specific instantiations of such constraints, the independent latent sources can be identified from their nonlinear mixtures up to a permutation and a component-wise transformation, thus achieving nontrivial identifiability of nonlinear ICA without auxiliary variables. We provide estimation methods and validate the theoretical results experimentally. The results on image data suggest that our conditions may hold in a number of practical data generating processes.


Binary Independent Component Analysis via Non-stationarity

arXiv.org Machine Learning

We consider independent component analysis of binary data. While fundamental in practice, this case has been much less developed than ICA for continuous data. We start by assuming a linear mixing model in a continuous-valued latent space, followed by a binary observation model. Importantly, we assume that the sources are non-stationary; this is necessary since any non-Gaussianity would essentially be destroyed by the binarization. Interestingly, the model allows for closed-form likelihood by employing the cumulative distribution function of the multivariate Gaussian distribution. In stark contrast to the continuous-valued case, we prove non-identifiability of the model with few observed variables; our empirical results imply identifiability when the number of observed variables is higher. We present a practical method for binary ICA that uses only pairwise marginals, which are faster to compute than the full multivariate likelihood.
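The generative structure in this abstract (non-stationary sources, linear mixing in a latent space, then binarization) can be sketched in a few lines. This is a sketch under stated assumptions: the observation model is taken to be thresholding at zero, the sources are zero-mean Gaussians whose standard deviations change per segment, and the function names and numeric values are hypothetical. The pairwise check uses the standard orthant probability of a zero-mean bivariate Gaussian, P(z_i > 0, z_j > 0) = 1/4 + arcsin(rho) / (2 pi).

```python
import numpy as np


def sample_binary_segment(mixing, source_stds, n_samples, rng):
    """One segment of binary data (hypothetical helper; thresholding is an assumption)."""
    n_src = mixing.shape[1]
    # Gaussian sources with segment-specific scales (non-stationarity).
    sources = rng.standard_normal((n_src, n_samples)) * source_stds[:, None]
    latent = mixing @ sources          # continuous-valued linear mixture
    return (latent > 0).astype(np.int8)  # binary observation


def pairwise_on_probability(mixing, source_stds, i, j):
    """Closed-form P(x_i = 1, x_j = 1) for the thresholded Gaussian model."""
    cov = mixing @ np.diag(source_stds**2) @ mixing.T
    rho = cov[i, j] / np.sqrt(cov[i, i] * cov[j, j])
    return 0.25 + np.arcsin(rho) / (2.0 * np.pi)


rng = np.random.default_rng(0)
A = np.array([[1.0, 0.4],
              [0.3, 1.0],
              [0.8, -0.5]])              # assumed mixing matrix
segment_stds = np.array([[1.0, 0.5],
                         [0.3, 2.0]])    # assumed per-segment source scales

segments = [sample_binary_segment(A, stds, 100_000, rng) for stds in segment_stds]
empirical = [float(np.mean(seg[0] * seg[1])) for seg in segments]
predicted = [float(pairwise_on_probability(A, stds, 0, 1)) for stds in segment_stds]
```

The pairwise marginal differs between the two segments, so the binary data retain information about the changing source variances even though binarization discards the shape of each marginal distribution.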


Compressive Independent Component Analysis: Theory and Algorithms

arXiv.org Machine Learning

In recent years, the size of datasets has grown exponentially as a result of advances in technology, signal acquisition, and the sophistication of modern-day mobile phones and devices. This has enabled researchers, statisticians and machine learning practitioners to build increasingly accurate models as a consequence of larger sample sizes and feature dimensions. Nevertheless, this poses a fundamental challenge to large-scale learning, as (i) traditional algorithms have computational complexity that scales with the order of the dataset dimensions, (ii) the whole dataset has to be stored or transferred to local RAM, as optimisation methods need to return to the data (or a random subset of the data) at subsequent iterations, and (iii) one is vulnerable to malicious attacks on potentially sensitive and personal information, as the data needs to be stored or transferred locally. Compressive learning (CL) [1, 2] partially addresses these fundamental challenges by severely compressing the whole dataset into a random representation of fixed size, called a sketch, in a single (or limited) pass over the data prior to learning. Once the sketch is formed, the parameters of the model are inferred solely from the sketch; hence a CL algorithm, for a given task or model, never needs to return to the original dataset, which can be deleted from memory as a result. At the core of the CL framework [1, 3] is the observation that, in general, the size of the sketch does not scale with the dimensions of the dataset, or indeed the data's underlying dimensionality, but instead is driven by the complexity or dimensionality of the task or model of interest.
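The single-pass, fixed-size property of a sketch can be made concrete with a short example. The sketch below averages random Fourier features over the data, which is a common choice in the compressive learning literature; it is an assumption here, and the sketching operator used for compressive ICA in the paper may differ. The class name `FourierSketch` and all parameter values are hypothetical.

```python
import numpy as np


class FourierSketch:
    """Running average of random Fourier features (illustrative, not the paper's operator)."""

    def __init__(self, dim, sketch_size, seed=0):
        rng = np.random.default_rng(seed)
        # Random frequencies are drawn once and then held fixed.
        self.frequencies = rng.standard_normal((sketch_size, dim))
        self.total = np.zeros(sketch_size, dtype=complex)
        self.count = 0

    def update(self, batch):
        """Fold a batch of shape (n, dim) into the sketch; the batch can then be discarded."""
        self.total += np.exp(1j * (batch @ self.frequencies.T)).sum(axis=0)
        self.count += batch.shape[0]

    def value(self):
        """Return the fixed-size sketch: the mean feature vector seen so far."""
        return self.total / self.count


rng = np.random.default_rng(1)
data = rng.standard_normal((10_000, 5))

# Stream the data in chunks, as if it never fit in memory at once.
streamed = FourierSketch(dim=5, sketch_size=64)
for chunk in np.array_split(data, 20):
    streamed.update(chunk)

# Same sketch computed from the full dataset in one call.
full = FourierSketch(dim=5, sketch_size=64)
full.update(data)
```

The sketch has 64 entries whether it summarises ten thousand rows or ten million, and the streamed and one-shot versions agree, so any learning step that works from the sketch alone has no reason to revisit the raw data.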


Learning Bijective Feature Maps for Linear ICA

arXiv.org Machine Learning

Separating high-dimensional data like images into independent latent factors remains an open research problem. Here we develop a method that jointly learns a linear independent component analysis (ICA) model with non-linear bijective feature maps. By combining these two methods, ICA can learn interpretable latent structure for images. For non-square ICA, where we assume the number of sources is less than the dimensionality of data, we achieve better unsupervised latent factor discovery than flow-based models and linear ICA. This performance scales to large image datasets such as CelebA.


groupICA: Independent component analysis for grouped data

arXiv.org Machine Learning

We introduce groupICA, a novel independent component analysis (ICA) algorithm which decomposes linearly mixed multivariate observations into independent components that are corrupted (and rendered dependent) by hidden group-wise confounding. It extends the ordinary ICA model in a theoretically sound and explicit way to incorporate group-wise (or environment-wise) structure in data and hence provides a justified alternative to the use of ICA on data blindly pooled across groups. In addition to our theoretical framework, we explain its causal interpretation and motivation, provide an efficient estimation procedure and prove identifiability of the unmixing matrix under mild assumptions. Finally, we illustrate the performance and robustness of our method on simulated data and run experiments on publicly available EEG datasets demonstrating the applicability to real-world scenarios. We provide a scikit-learn compatible pip-installable Python package groupICA as well as R and Matlab implementations accompanied by documentation and an audible example at https://sweichwald.de/groupICA.


Heavy-Tailed Analogues of the Covariance Matrix for ICA

AAAI Conferences

Independent Component Analysis (ICA) is the problem of learning a square matrix A, given samples of X = A S, where S is a random vector with independent coordinates. Most existing algorithms are provably efficient only when each S_i has finite and moderately valued fourth moment. However, there are practical applications where this assumption need not be true, such as speech and finance. Algorithms have been proposed for heavy-tailed ICA, but they are not practical, using random walks and the full power of the ellipsoid algorithm multiple times. The main contributions of this paper are: (1) A practical algorithm for heavy-tailed ICA that we call HTICA. We provide theoretical guarantees and show that it outperforms other algorithms in some heavy-tailed regimes, both on real and synthetic data. Like the current state-of-the-art, the new algorithm is based on the centroid body (a first moment analogue of the covariance matrix). Unlike the state-of-the-art, our algorithm is practically efficient. To achieve this, we use explicit analytic representations of the centroid body, which bypasses the use of the ellipsoid method and random walks. (2) We study how heavy tails affect different ICA algorithms, including HTICA. Somewhat surprisingly, we show that some algorithms that use the covariance matrix or higher moments can successfully solve a range of ICA instances with infinite second moment. We study this theoretically and experimentally, with both synthetic and real-world heavy-tailed data.
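The model X = A S and the trouble with fourth moments under heavy tails can be illustrated numerically. This is a minimal sketch and not HTICA: the sources are assumed to follow a Student-t distribution with 3 degrees of freedom (finite variance, infinite fourth moment), the mixing matrix is an arbitrary example, and `top_share_of_fourth_power` is a hypothetical diagnostic that measures how much of the empirical fourth moment comes from the largest 0.1% of samples.

```python
import numpy as np


def top_share_of_fourth_power(samples, top_fraction=0.001):
    """Fraction of sum(x^4) contributed by the largest samples (hypothetical diagnostic)."""
    powers = np.sort(np.abs(samples) ** 4)
    k = max(1, int(top_fraction * powers.size))
    return powers[-k:].sum() / powers.sum()


rng = np.random.default_rng(0)
n = 100_000

# Square mixing matrix A (assumed example values) and heavy-tailed independent sources.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
S_heavy = rng.standard_t(df=3, size=(2, n))
X = A @ S_heavy

# If A were known, the sources would follow by solving the linear system;
# ICA has to estimate A from X alone.
S_recovered = np.linalg.solve(A, X)

# Compare how concentrated the empirical fourth moment is for heavy and light tails.
heavy_share = top_share_of_fourth_power(S_heavy[0])
light_share = top_share_of_fourth_power(rng.uniform(-1.0, 1.0, size=n))
```

When a handful of extreme samples dominate the empirical fourth moment, estimates built on it become unstable, which is the regime where a first-moment object such as the centroid body is attractive.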